A Developmental Model of Trust in Humanoid Robots
Trust between humans and artificial systems has recently received increased attention due to the widespread use of autonomous systems in our society. In this context, trust plays a dual role. On the one hand, it is necessary to build robots that are perceived as trustworthy by humans. On the other hand, we need to give those robots the ability to discriminate between reliable and unreliable informants. This thesis focuses on the second problem, presenting an interdisciplinary investigation of trust, and in particular a computational model based on neuroscientific and psychological assumptions. First, the use of Bayesian networks for modelling causal relationships was investigated. This approach follows the well-known theory-theory framework of the Theory of Mind (ToM) and an established line of research based on the Bayesian description of mental processes. Next, the role of gaze in human-robot interaction was investigated. The results of this research were used to design a head-pose estimation system based on Convolutional Neural Networks. The system can be used on robotic platforms to facilitate joint-attention tasks and enhance trust. Finally, everything was integrated into a structured cognitive architecture. The architecture is based on an actor-critic reinforcement learning framework and an intrinsic motivation feedback signal given by a Bayesian network. To evaluate the model, the architecture was embodied in the iCub humanoid robot and used to replicate a developmental experiment. The model provides a plausible description of children's reasoning and sheds some light on the underlying mechanisms involved in trust-based learning. In the last part of the thesis, the contribution of human-robot interaction research is discussed, with the aim of understanding the factors that influence the establishment of trust during joint tasks.
Overall, this thesis provides a computational model of trust that takes into account the development of cognitive abilities in children, with a particular emphasis on the ToM and the underlying neural dynamics.
THRIVE, Air Force Office of Scientific Research, Award No. FA9550-15-1-002
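The ability to discriminate between reliable and unreliable informants can be sketched as a Bayesian belief update over each informant's accuracy. This is a minimal illustrative sketch (a single Beta-Bernoulli node, not the thesis's full Bayesian network; the class and method names are hypothetical):

```python
class InformantModel:
    """Beta-Bernoulli estimate of an informant's reliability.

    Each informant is tracked with a Beta(alpha, beta) posterior over
    the probability that their testimony is accurate. Starting from a
    uniform prior Beta(1, 1), each observed accurate/inaccurate label
    increments the corresponding pseudo-count.
    """

    def __init__(self):
        self.alpha = 1.0  # pseudo-count of accurate testimony
        self.beta = 1.0   # pseudo-count of inaccurate testimony

    def update(self, was_accurate):
        if was_accurate:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def reliability(self):
        # Posterior mean of Beta(alpha, beta)
        return self.alpha / (self.alpha + self.beta)


reliable = InformantModel()
unreliable = InformantModel()
for _ in range(10):
    reliable.update(True)      # informant who labels objects correctly
    unreliable.update(False)   # informant who labels them incorrectly

# reliable.reliability() is now high, unreliable.reliability() low,
# which is the signal a learner can use to weight testimony.
```

In a developmental experiment of the kind replicated on the iCub, such a reliability estimate could gate how strongly the robot updates its beliefs from each informant's testimony.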
Emotion Recognition in the Wild using Deep Neural Networks and Bayesian Classifiers
Group emotion recognition in the wild is a challenging problem, due to the
unstructured environments in which everyday life pictures are taken. Some of
the obstacles for an effective classification are occlusions, variable lighting
conditions, and image quality. In this work we present a solution based on a
novel combination of deep neural networks and Bayesian classifiers. The neural
network works on a bottom-up approach, analyzing emotions expressed by isolated
faces. The Bayesian classifier estimates a global emotion integrating top-down
features obtained through a scene descriptor. To validate the system, we
tested the framework on the dataset released for the Emotion Recognition in
the Wild Challenge 2017. Our method achieved an accuracy of 64.68% on the test
set, significantly outperforming the 53.62% competition baseline.
Comment: accepted by the Fifth Emotion Recognition in the Wild (EmotiW)
Challenge 2017
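The bottom-up/top-down combination can be sketched as a naive-Bayes-style fusion: per-face emotion posteriors from the neural network are multiplied with a scene-level prior and renormalized. This is an illustrative assumption, not the paper's exact classifier (the `fuse` function, the three-class layout, and the conditional-independence assumption are all hypothetical):

```python
import math


def fuse(face_probs_list, scene_prior):
    """Combine per-face emotion posteriors (bottom-up) with a
    scene-level prior (top-down) under a conditional-independence
    assumption: multiply in log space, then renormalize."""
    n_classes = len(scene_prior)
    log_scores = [math.log(p) for p in scene_prior]
    for face in face_probs_list:
        for c in range(n_classes):
            log_scores[c] += math.log(face[c])
    # Softmax-style renormalization with max-subtraction for stability
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


# Two detected faces leaning "positive", plus a mildly positive
# scene prior over [positive, neutral, negative]
faces = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
prior = [0.4, 0.4, 0.2]
posterior = fuse(faces, prior)  # sums to 1, argmax = "positive"
```

The design choice illustrated here is that the scene descriptor acts as a prior that can tip ambiguous face-level evidence one way or the other.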
Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization
The autonomous landing of an Unmanned Aerial Vehicle (UAV) on a marker is one of the most challenging problems in robotics. Many solutions have been proposed, with the best results achieved via customized geometric features and external sensors. This paper discusses for the first time the use of deep reinforcement learning as an end-to-end learning paradigm to find a policy for autonomous UAV landing. Our method is based on a divide-and-conquer paradigm that splits a task into sequential sub-tasks, each one assigned to a Deep Q-Network (DQN), hence the name Sequential Deep Q-Network (SDQN). Each DQN in an SDQN is activated by an internal trigger, and it represents a component of a high-level control policy that can navigate the UAV towards the marker. Different technical solutions have been implemented, for example combining vanilla and double DQNs, and the introduction of a partitioned buffer replay to address the problem of sample efficiency. One of the main contributions of this work consists in showing how an SDQN trained in a simulator via domain randomization can effectively generalize to real-world scenarios of increasing complexity. The performance of SDQNs is comparable with that of a state-of-the-art algorithm and human pilots, while being quantitatively better in noisy conditions.
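A partitioned buffer replay can be sketched as a replay memory split by reward sign, so that rare successful outcomes (e.g. a completed landing) are not drowned out by the far more common failures when sampling minibatches. This is a minimal illustrative sketch under that assumption; the class name, partition keys, and equal-share sampling rule are hypothetical, not the paper's exact implementation:

```python
import random


class PartitionedReplayBuffer:
    """Replay memory partitioned by reward sign, with balanced sampling."""

    def __init__(self, capacity_per_partition=10000, seed=0):
        self.partitions = {"positive": [], "negative": [], "neutral": []}
        self.capacity = capacity_per_partition
        self.rng = random.Random(seed)

    def add(self, transition, reward):
        key = ("positive" if reward > 0
               else "negative" if reward < 0
               else "neutral")
        part = self.partitions[key]
        if len(part) >= self.capacity:
            part.pop(0)  # drop the oldest transition (FIFO)
        part.append(transition)

    def sample(self, batch_size):
        # Draw an equal share from each non-empty partition, so rare
        # positive-reward transitions stay well represented.
        parts = [p for p in self.partitions.values() if p]
        per_part = max(1, batch_size // len(parts))
        batch = []
        for p in parts:
            batch.extend(self.rng.choices(p, k=per_part))
        return batch[:batch_size]


buf = PartitionedReplayBuffer(capacity_per_partition=100)
for _ in range(99):
    buf.add("crash", reward=-1.0)   # frequent failed episodes
buf.add("land", reward=+1.0)        # one rare successful landing
batch = buf.sample(10)              # the "land" transition still appears
```

With a single uniform buffer, a minibatch of 10 from these 100 transitions would usually miss the lone success; the partitioned sampler guarantees it a share of every batch.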
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation
In this paper we explore few-shot imitation learning for control problems,
which involves learning to imitate a target policy by accessing a limited set
of offline rollouts. This setting has been relatively under-explored despite
its relevance to robotics and control applications. State-of-the-art methods
developed to tackle few-shot imitation rely on meta-learning, which is
expensive to train as it requires access to a distribution over tasks (rollouts
from many target policies and variations of the base environment). Given this
limitation we investigate an alternative approach, fine-tuning, a family of
methods that pretrain on a single dataset and then fine-tune on unseen
domain-specific data. Recent work has shown that fine-tuners outperform
meta-learners in few-shot image classification tasks, especially when the data
is out-of-domain. Here we evaluate to what extent this is true for control
problems, proposing a simple yet effective baseline which relies on two stages:
(i) training a base policy online via reinforcement learning (e.g. Soft
Actor-Critic) on a single base environment, (ii) fine-tuning the base policy
via behavioral cloning on a few offline rollouts of the target policy. Despite
its simplicity, this baseline is competitive with meta-learning methods under a
variety of conditions and is able to imitate target policies trained on unseen
variations of the original environment. Importantly, the proposed approach is
practical and easy to implement, as it does not need any complex meta-training
protocol. As a further contribution, we release an open source dataset called
iMuJoCo (iMitation MuJoCo) consisting of 154 variants of popular OpenAI-Gym
MuJoCo environments with associated pretrained target policies and rollouts,
which can be used by the community to study few-shot imitation learning and
offline reinforcement learning.
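Stage (ii) of the baseline, behavioral cloning on a few offline rollouts, can be sketched as a supervised regression from states to the target policy's actions. The sketch below is deliberately minimal and hypothetical: the policy is a 1-D linear map a = w * s, and `w_init` stands in for the parameter produced by stage (i)'s online RL (e.g. Soft Actor-Critic), which is not reproduced here:

```python
def behavioral_cloning(rollouts, w_init=0.0, lr=0.1, epochs=200):
    """Fine-tune a 1-D linear policy a = w * s by behavioral cloning:
    minimize the mean squared error between the policy's actions and
    the target policy's actions over the offline rollout samples.
    """
    w = w_init
    for _ in range(epochs):
        grad = 0.0
        for s, a in rollouts:
            grad += 2.0 * (w * s - a) * s  # d/dw of (w*s - a)^2
        w -= lr * grad / len(rollouts)     # average-gradient descent step
    return w


# A few offline (state, action) pairs from a target policy acting
# as a = 2.0 * s; behavioral cloning recovers w close to 2.0.
rollouts = [(0.5, 1.0), (1.0, 2.0), (-1.0, -2.0), (2.0, 4.0)]
w = behavioral_cloning(rollouts, w_init=0.0)
```

The point of the sketch is the two-stage structure: a pretrained base parameter is adapted to the target with plain supervised learning, with no meta-training protocol involved.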